In  Silico  Design  of  Smart  Binders  to  Anthrax  PA 


by  Michael  S.  Sellers  and  Margaret  M.  Hurley 


ARL-RP-0399  September  2012 


A  reprint  from  Proceedings  of  SPIE 


Approved  for  public  release;  distribution  unlimited. 


NOTICES 

Disclaimers 

The  findings  in  this  report  are  not  to  be  construed  as  an  official  Department  of  the  Army  position 
unless  so  designated  by  other  authorized  documents. 

Citation  of  manufacturer’s  or  trade  names  does  not  constitute  an  official  endorsement  or 
approval  of  the  use  thereof. 


Destroy  this  report  when  it  is  no  longer  needed.  Do  not  return  it  to  the  originator. 


Army  Research  Laboratory 

Aberdeen  Proving  Ground,  MD  21005 


Weapons  and  Materials  Research  Directorate,  ARL 



REPORT  DOCUMENTATION  PAGE 




1. REPORT DATE (DD-MM-YYYY): September 2012
2. REPORT TYPE:
3. DATES COVERED (From - To):
4. TITLE AND SUBTITLE: In Silico Design of Smart Binders to Anthrax PA
5a. CONTRACT NUMBER:
5b. GRANT NUMBER:
5c. PROGRAM ELEMENT NUMBER:
5d. PROJECT NUMBER:
5e. TASK NUMBER:
5f. WORK UNIT NUMBER:
6. AUTHOR(S): Michael S. Sellers and Margaret M. Hurley
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES):
   U.S. Army Research Laboratory
   ATTN: RDRL-WML-B
   Aberdeen Proving Ground, MD 21005
8. PERFORMING ORGANIZATION REPORT NUMBER: ARL-RP-0399
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES):
10. SPONSOR/MONITOR'S ACRONYM(S):
11. SPONSOR/MONITOR'S REPORT NUMBER(S):
12. DISTRIBUTION/AVAILABILITY STATEMENT: Approved for public release; distribution unlimited.
13. SUPPLEMENTARY NOTES:
14. ABSTRACT:

The development of smart peptide binders requires an understanding of the fundamental mechanisms of recognition which has
remained an elusive grail of the research community for decades. Recent advances in automated discovery and synthetic library
science provide a wealth of information to probe fundamental details of binding and facilitate the development of improved
models for a priori prediction of affinity and specificity. Here we present the modeling portion of an iterative
experimental/computational study to produce high affinity peptide binders to the Protective Antigen (PA) of Bacillus anthracis.
The result is a general usage, HPC-oriented, python-based toolkit based upon powerful third-party freeware, which is designed
to provide a better understanding of peptide-protein interactions and ultimately predict and measure new smart peptide binder
candidates. We present an improved simulation protocol with flexible peptide docking to the Anthrax Protective Antigen,
reported within the context of experimental data presented in a companion work.

15. SUBJECT TERMS: Docking, molecular recognition, peptide binders
16. SECURITY CLASSIFICATION OF:
    a. REPORT: UNCLASSIFIED
    b. ABSTRACT: UNCLASSIFIED
    c. THIS PAGE: UNCLASSIFIED
17. LIMITATION OF ABSTRACT: UU
18. NUMBER OF PAGES: 14
19a. NAME OF RESPONSIBLE PERSON: Michael S. Sellers
19b. TELEPHONE NUMBER (Include area code): (401) 306-1905

Standard Form 298 (Rev. 8/98)
Prescribed by ANSI Std. Z39.18



In  Silico  Design  of  Smart  Binders  to  Anthrax  PA 

Michael Sellers*, Margaret M. Hurley

U.S.  Army  Research  Laboratory,  Weapons  and  Materials  Research  Directorate,  Aberdeen  Proving  Ground,  MD  21005 


ABSTRACT 

The  development  of  smart  peptide  binders  requires  an  understanding  of  the  fundamental  mechanisms  of  recognition 
which  has  remained  an  elusive  grail  of  the  research  community  for  decades.  Recent  advances  in  automated  discovery  and 
synthetic  library  science  provide  a  wealth  of  information  to  probe  fundamental  details  of  binding  and  facilitate  the 
development  of  improved  models  for  a  priori  prediction  of  affinity  and  specificity.  Here  we  present  the  modeling  portion 
of  an  iterative  experimental/computational  study  to  produce  high  affinity  peptide  binders  to  the  Protective  Antigen  (PA) 
of  Bacillus  anthracis.  The  result  is  a  general  usage,  HPC-oriented,  python-based  toolkit  based  upon  powerful  third-party 
freeware,  which  is  designed  to  provide  a  better  understanding  of  peptide-protein  interactions  and  ultimately  predict  and 
measure  new  smart  peptide  binder  candidates.  We  present  an  improved  simulation  protocol  with  flexible  peptide  docking 
to  the  Anthrax  Protective  Antigen,  reported  within  the  context  of  experimental  data  presented  in  a  companion  work.

1.  INTRODUCTION

The core of a biosensor is a recognition element, carefully chosen to react with a target analyte such as an antigen or
contaminant. While biological recognition elements are common, the lure of improved characteristics (increased
stability, better affinity, improved specificity) makes synthetic recognition elements such as aptamers and peptides an
intriguing alternative. With this in mind, researchers at the U.S. Army Research Laboratory and the Edgewood
Chemical and Biological Center are endeavoring to design smart peptide binders with enhanced affinity and specificity for
the anthrax (Bacillus anthracis) protective antigen.

A  fundamental  understanding  of  molecular  recognition  is  critical  to  the  design  of  improved  synthetic  biosensors.  This 
study  utilizes  a  combined  experimental  and  computational  approach  to  probe  the  details  of  the  binding.  By  utilizing  an 
iterative  process  to  successively  predict,  analyze,  and  retune  the  model,  we  hope  to  develop  an  improved  computational 
protocol  which  correctly  reproduces  the  essential  physical  elements  of  the  peptide-protein  interaction  and  is  capable  of 
reliable  a  priori  prediction  of  peptide  binding  affinity  and  specificity. 

With this end in mind, the computational side of this study faces a significant challenge. The problem of finding the
natural orientation of two complexed moieties is known as "docking" in the modeling community. Over the last two
decades, the docking problem has spurred considerable interest and software development, and researchers have worked
to overcome the high number of rotational and translational degrees of freedom associated with the binding of two
molecular partners. This is compounded by the fact that early docking algorithms were developed based on the classic
enzyme-substrate lock-and-key motif. When the aggregation of two larger biological entities is involved, such as a
protein-protein or peptide-protein complex, the resulting image is actually more similar to the association of two balls of
yarn.

A prodigious number of docking codes is available to the interested user, with varying strengths and weaknesses which
are carefully analyzed by the community in open competitions such as CAPRI (Critical Assessment of Prediction of
Interactions). [1] It is beyond the scope of this paper to provide a detailed review of docking software, and
we restrict ourselves to codes covered within the scope of this project, such as GRAMM-X [2] and PTOOLS [3].
Ultimately the Rosetta Software Suite was utilized as the core docking component of our toolkit. [4] The powerful suite is
a multi-group supported effort, with numerous capabilities including improved conformational sampling for both
sidechain and backbone. This is of interest, as small proteins and short peptide chains that infrequently hold tertiary
structure are difficult to dock via current methods. As delineated in several recent works by Baker and Gray, it is vital to
the accuracy of the result to include both the flexibility of these peptides and possible changes to the receptor protein. [5]
Additionally, the standard description of energetic interactions between target and receptor is not always transferable to
such highly flexible partners, and merits further scrutiny. [6]

*Michael.s.sellers9.ctr@mail.mil; phone 1 410 306-1905; fax 1 410 306-1909

Chemical, Biological, Radiological, Nuclear, and Explosives (CBRNE) Sensing XIII, edited by Augustus Way Fountain III,
Proc. of SPIE Vol. 8358, 835807 · ©2012 SPIE · CCC code: 0277-786X/12/$18 · doi: 10.1117/12.918424

Accordingly, we have attempted to improve the treatment of flexibility, as well as the treatment of additional
issues such as hydrogen bonding and solvation, through the combination of multiple computational methods into an
HPC-ready, extensible, free, Python-based simulation toolkit. This preliminary work focuses on the initial framework
combining the Rosetta Software Suite via the PyRosetta script-based interface [7] and the NAMD molecular dynamics
simulation package developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for
Advanced Science and Technology at the University of Illinois at Urbana-Champaign [8]. The current combined
computational/experimental effort of designing smart peptide binders to the Anthrax protective antigen provides an
excellent case study for the development of this toolkit. In the Methodology section, we briefly describe the peptides
under analysis and the Anthrax protective antigen protein, and provide an outline of the computational protocol, which is
dubbed the XPairIt toolkit. Our flexibility study and docking results are presented in Results, and we identify possible
binding locations of the peptide. In Conclusions, we discuss our docking results and present a path for future work.


2.  METHODOLOGY 

In addition to Rosetta and NAMD, the XPairIt toolkit interfaces with the APBS simulator for improved electrostatics [9],
the STRIDE secondary structure prediction tool to assess structural changes during dynamics and on binding [10], and the
PSFGen structure builder included with VMD and NAMD. The XPairIt protocol for global and focused docking is
described below. A more complete review of the protocol is in preparation.

2.1  Target  Analyte-Recognition  Element  System 

The starting structure of the target analyte, the Anthrax protective antigen (PA), is derived from the experimental
structure of Petosa et al. [12] available in the RCSB PDB (1ACC) [13]. The 1ACC structure is missing residues, and
significant work was done to return the protein to its soluble form. We direct the reader to the aforementioned future
work for a detailed description of the process. Following this, the PA protein structure is solvated and prepared for
docking with a 6 nanosecond (ns) molecular dynamics simulation in the NPT ensemble (constant particle number,
pressure, and temperature) at 300 K, with the CHARMM force-field [14] and TIP3 water. The resulting structure at 6 ns is
then minimized.

The recognition element, a 15-mer peptide developed by experimental collaborators within the context of this work and
here referred to as DS23, is constructed within VMD and prepared in a fashion similar to the PA, with a molecular
dynamics simulation in TIP3 water. Representative structures from this trajectory are saved every 10 ps. The minimized
protein structure and ensemble of DS23 structures are then fed into our docking algorithm. This process is depicted in
Figure 2.1.

2.2  Flexibility-Enhanced  Docking  Scheme 

Our docking procedure is two-level. At the first level, a DS23 structure is randomly chosen from the dynamics trajectory
described in section 2.1, and paired with the minimized PA protein. The two structures are then randomly rotated around
their respective centers of mass. Next, they are moved along the vector between their centers of mass until they are in
contact (at ~4.0 Å), as illustrated in Figure 2.1(B). This is repeated 2000 times to create a set of random starting structures.
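The first-level setup can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the XPairIt implementation: coordinates are plain (x, y, z) tuples, all atoms are given equal mass, the partners are assumed to start well separated, the 0.25 Å step size is an arbitrary choice, and all function names are hypothetical.

```python
import math
import random

def center_of_mass(coords):
    """Centroid of a list of (x, y, z) tuples (equal masses assumed)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def random_rotation_matrix(rng):
    """Uniformly distributed rotation built from a random unit quaternion."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    w = math.sqrt(1 - u1) * math.sin(2 * math.pi * u2)
    x = math.sqrt(1 - u1) * math.cos(2 * math.pi * u2)
    y = math.sqrt(u1) * math.sin(2 * math.pi * u3)
    z = math.sqrt(u1) * math.cos(2 * math.pi * u3)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def rotate_about_com(coords, rot):
    """Rotate a structure about its own center of mass."""
    com = center_of_mass(coords)
    rotated = []
    for c in coords:
        d = [c[i] - com[i] for i in range(3)]
        rotated.append(tuple(
            com[i] + sum(rot[i][j] * d[j] for j in range(3)) for i in range(3)
        ))
    return rotated

def min_distance(a, b):
    """Smallest atom-atom distance between two structures."""
    return min(math.dist(p, q) for p in a for q in b)

def place_in_contact(protein, peptide, contact=4.0, step=0.25, max_steps=10000):
    """Slide the peptide toward the protein along the vector joining their
    centers of mass until the closest atom pair is within `contact` angstroms."""
    pc, qc = center_of_mass(protein), center_of_mass(peptide)
    v = [pc[i] - qc[i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    unit = [x / norm for x in v]
    moved = list(peptide)
    for _ in range(max_steps):
        if min_distance(protein, moved) <= contact:
            return moved
        moved = [tuple(p[i] + step * unit[i] for i in range(3)) for p in moved]
    raise RuntimeError("structures never came into contact")
```

In the protocol described above, both partners would be rotated with independent random matrices before the approach step, and the whole procedure repeated once per starting structure.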

Once this first-level set is accumulated, each of these structures forms the preliminary structure for 20 individual
second-level simulations. Each second-level docking simulation begins by randomly rotating the peptide around its center of
mass. This rotation is in addition to the first-level rotation, to ensure sufficient sampling of the peptide-protein surface
interaction. Next, the RosettaDock software's main docking method is used to randomly perturb the peptide structure via
small translations and rotations of the entire molecule, and an overall energy is computed after each move. The positions
of both DS23 and PA sidechains are also perturbed through sampling of Rosetta's rotamer library, a process referred to
as "repacking." [15] After a complete series of trial perturbations is performed, the resulting structure with the lowest
energy is saved. This produces a set of 40,000 second-level structures (2000 starting structures, 20 simulations each),
providing a somewhat static perspective of the dominant interactions in the complex.
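The bookkeeping of the two-level scheme can be illustrated with a toy sketch. It is an assumption-laden simplification: the pose is reduced to a single number, the energy function and perturbation move are stand-ins supplied by the caller, and the inner loop simply keeps the lowest-energy pose seen along a random walk rather than reproducing RosettaDock's own sampling and acceptance rules. All names are hypothetical.

```python
import random

def lowest_energy_trial(start, energy, perturb, n_trials, rng):
    """Perturb a pose n_trials times and return the lowest-energy pose seen."""
    best, best_e = start, energy(start)
    current = start
    for _ in range(n_trials):
        current = perturb(current, rng)
        e = energy(current)
        if e < best_e:
            best, best_e = current, e
    return best, best_e

def two_level_docking(n_first, n_second, make_start, energy, perturb,
                      n_trials, seed=0):
    """Run n_second independent second-level searches from each of
    n_first random first-level starting poses."""
    rng = random.Random(seed)
    results = []
    for i in range(n_first):
        start = make_start(rng)  # first level: random starting pose
        for j in range(n_second):
            pose, e = lowest_energy_trial(start, energy, perturb, n_trials, rng)
            results.append((i, j, pose, e))
    return results

# Toy stand-ins: a one-dimensional "pose" with a quadratic energy well.
toy_energy = lambda x: (x - 3.0) ** 2
toy_perturb = lambda x, rng: x + rng.gauss(0.0, 0.5)
toy_start = lambda rng: rng.uniform(-10.0, 10.0)

toy_results = two_level_docking(5, 4, toy_start, toy_energy, toy_perturb, 50)
```

With `n_first=2000` and `n_second=20`, the same loop structure yields the 40,000 docked structures quoted in the text.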

To improve the treatment of flexibility of both partners, and to account for relatively small scale induced fit
conformational changes, a 1.5 ps molecular dynamics run at 300 K in implicit water is then performed on each
second-level docked structure. The Generalized Born Implicit Solvation (GBIS) model [16] as implemented within NAMD [17]
is used. After the dynamics is complete, the structure is again minimized with NAMD, and the sidechains are "repacked"
and re-minimized in Rosetta. Data from this entire process (energies, hydrogen bonds, and overall changes in positions)
at various stages are recorded and reported in Results.

With  this  flexibility  enhanced  docking  scheme,  we  generate  40,000  configurations  of  DS23+PA,  which  are  then  ranked 
and  analyzed. 


Figure 2.1 (A) Structure preparation procedure. Peptide (DS23) and protein (1ACC → PA) are simulated separately with molecular dynamics for 6 ns
in TIP3 water with CHARMM. Peptide snapshots are saved every 10 ps. Protein is subsequently minimized. (B) Initial docking run setup. Structures
are rotated around respective centers of mass and placed in contact.

2.3  Docking  Analysis

Comparison of energetics and structural clustering are used to parse the resulting 40,000 docked configurations to
determine the single structure (or small set of structures) that is most representative of the natural complex of our DS23
peptide and PA protein. In the docking community, computing the energy of a structure for the purpose of comparison to
other, similar structures is known as "scoring." Here, each docked structure is scored with Rosetta's internal energy
function score12. [15] However, simple energetic comparison often does not provide a clear indication of the best docked
structure: the top 10 structures may have very similar scores, certainly all within the variability associated with thermal
fluctuations. Additional information must be used for further refinement.

Accordingly, docked structures are also ranked by their interface energy, computed with Rosetta's score12 function, and
the top 25 structures are clustered by peptide binding location over the protein surface. This process is demonstrated
within Results. The interface energy is defined as:

$$E_{\mathrm{Interface}} = E_{\mathrm{Total}} - E_{\mathrm{Peptide}} - E_{\mathrm{Protein}} \qquad (1)$$
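As a small worked illustration of this definition, the sketch below computes the interface energy from three scores per docked structure and keeps the most favorable entries. The record type, field names, and numerical values are hypothetical and chosen only to show that ranking by interface energy can differ from ranking by total score; in practice the three terms would come from the scoring function.

```python
from typing import NamedTuple

class DockedPose(NamedTuple):
    name: str
    e_total: float    # score of the complex
    e_peptide: float  # score of the isolated peptide
    e_protein: float  # score of the isolated protein

def interface_energy(pose):
    """Interface energy: total score minus the scores of the separated partners."""
    return pose.e_total - pose.e_peptide - pose.e_protein

def top_by_interface(poses, n=25):
    """Return the n poses with the lowest (most favorable) interface energy."""
    return sorted(poses, key=interface_energy)[:n]

# Illustrative values only (not taken from the study).
example_poses = [
    DockedPose("a", -110.0, -20.0, -85.0),  # interface energy -5.0
    DockedPose("b", -108.0, -20.0, -80.0),  # interface energy -8.0
    DockedPose("c", -112.0, -21.0, -90.0),  # interface energy -1.0
]
best_two = top_by_interface(example_poses, n=2)
```

Here pose "c" has the best total score but the weakest interface energy, which is the situation the second ranking criterion is meant to expose.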


3.  RESULTS 

Using Rosetta docking methods and NAMD molecular dynamics within the XPairIt toolkit, we generated 40,000 bound
configurations of our DS23 peptide and the Anthrax protective antigen (PA). Before the docking runs, the DS23 peptide
was simulated using molecular dynamics to generate an ensemble of structures exhibiting the molecule's flexibility.
Randomly selected structures from this ensemble were then combined with a full PA protein generated from RCSB's
1ACC structure, and docked according to the scheme outlined in Methodology. Results of these docking simulations are
analyzed below, and we identify key residues on the DS23 involved in binding, as well as specify a binding location on
the PA.

3.1  Analysis  of  Initial  DS23  Dynamics 

We briefly provide an analysis of approximately 22 ns of dynamics (after equilibration) of the DS23 peptide in TIP3
water at 300 K. This was performed to assess the native fluctuation of the backbone structure of the peptide, to assess the
suitability of a 6 ns trajectory for capturing essential backbone motions, and to provide a basis of comparison for
structural changes induced upon binding to the PA. A range of properties is used to monitor this behavior, including the
radius of gyration, percent helicity, and a three-residue sequence helicity parameter discussed by Speranskiy et al. [18],
which provides a measure of a more extended helical character. After equilibration, the radius of gyration of this
compound fluctuates about an average value of 7.2 Å. This is slightly less than the value of 8.72 Å which would be
attributed to the same peptide sequence in a perfect helical configuration, and markedly less than the 14.8 Å radius of
gyration of the beta configuration, or the linear value of 16.1 Å. It is evident that the structure of the peptide in solvent is
quite uniformly compact, although not globular. The end-to-end distance similarly fluctuates regularly about an average
of 15.7 Å.
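For reference, both size measures can be computed per trajectory frame from atomic coordinates. The sketch below is a generic illustration, assuming coordinates are given as (x, y, z) tuples in Å and, unless masses are supplied, weighting all atoms equally; which atoms enter the calculation in the study itself is not stated here.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted root-mean-square distance of atoms from the center of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights
    total = sum(masses)
    com = [sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)]
    sq = sum(
        m * sum((c[i] - com[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(sq / total)

def end_to_end_distance(coords):
    """Distance between the first and last points of the chain."""
    return math.dist(coords[0], coords[-1])

# Check on a fully extended toy chain: 15 points spaced 1 unit apart on a line.
# For N equally spaced points with spacing d, Rg = d * sqrt((N**2 - 1) / 12).
line = [(float(i), 0.0, 0.0) for i in range(15)]
rg_line = radius_of_gyration(line)
ee_line = end_to_end_distance(line)
```

Applying these functions to each saved snapshot gives the time series whose average and fluctuation are discussed above.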




Figure 3.1.1 Dynamic trends of radius of gyration of solvated DS23 vs. simulation time (ns).

Within the course of this trajectory, the percent helicity (calculated with STRIDE) fluctuates continually and regularly
between 0.0 and 66%. The overall average percent helicity through the course of the simulation is 34% with a standard
deviation of 13%. It is apparent from the stability of the radius of gyration value, however, that there is no extension of
the peptide backbone or uncoiling involved in this process, and this is also evident from visualization of the trajectory.
The three-residue sequence parameter of Speranskiy, which we shall abbreviate H3, fluctuates regularly between 0 and 5
throughout the same timeframe, with a final average value of 3.04. We interpret this to mean that the helical character of
the peptide is concentrated in a sequence of approximately 5 residues, which upon closer examination is seen to lie roughly
in the middle of the peptide.
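Percent helicity can be obtained from a per-residue secondary structure string such as the one-letter assignments produced by STRIDE. The sketch below is illustrative only: it assumes the assignment has already been parsed into a string with one character per residue, and the choice to count the codes H, G, and I as helical is an assumption, since the text does not say which helix types were included. The H3 parameter is not reproduced because its definition is not given here.

```python
import statistics

def percent_helicity(ss, helix_codes=("H", "G", "I")):
    """Percentage of residues whose secondary structure code is helical."""
    if not ss:
        raise ValueError("empty secondary structure string")
    return 100.0 * sum(1 for code in ss if code in helix_codes) / len(ss)

def helicity_statistics(frames):
    """Mean and population standard deviation of percent helicity over frames."""
    values = [percent_helicity(ss) for ss in frames]
    return statistics.mean(values), statistics.pstdev(values)

# Hypothetical assignments for a 15-residue peptide in three frames.
example_frames = [
    "CCTTHHHHHHHHHHC",  # 10 of 15 residues helical
    "CCCCCHHHHHCCCCC",  # 5 of 15 residues helical
    "CCCCCCCCCCCCCCC",  # no helical residues
]
mean_helicity, sd_helicity = helicity_statistics(example_frames)
```

For a 15-mer, ten helical residues correspond to about 67% helicity, which is consistent in magnitude with the upper end of the range reported above.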

3.2  Ranking  of  Docked  Structures  and  Structural  Similarities 

Docked structures of DS23+PA were ranked in two rounds, in conjunction with the two-level docking scheme outlined
in Methodology. The 20 docked structures generated from each second-level simulation procedure were first ranked by
total score (E_Total in Equation 1), and the top scoring structure of each set of 20 was kept. This generated a collection of
2000 top scoring docked structures, clustered over various locations on PA. Next, the structures were ranked by their
interface energy, as a means to identify highly favorable regions of contact on the PA. Histograms of this second ranking
are shown in Figure 3.2.1; here, each bin corresponds to a residue on the PA, and a contact distance threshold of 4.0 Å is
used to generate the histogram. In Figure 3.2.1-LEFT, a contact histogram is shown for the top 25 structures ranked by
interface energy. PA residues 300-320 show the highest degree of contact. The distribution of peaks throughout the contact
map provides evidence that other potential binding locations exist around the PA. By narrowing the analysis to the top 5
structures, shown in Figure 3.2.1-RIGHT, a handful of residues on the PA emerge as clear points of contact for the DS23
peptide.
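A contact histogram of this kind can be built with a simple distance test. The following sketch assumes each docked structure is available as a list of protein atoms tagged with their residue number plus a list of peptide atom coordinates, and it counts a residue at most once per structure; whether the published histograms count per structure or per atom pair is not stated, so that choice is an assumption, as are the function names.

```python
import math
from collections import Counter

def contact_residues(protein_atoms, peptide_atoms, cutoff=4.0):
    """Residue numbers of protein atoms lying within `cutoff` angstroms
    of any peptide atom. protein_atoms is a list of (resid, (x, y, z))."""
    return {
        resid
        for resid, xyz in protein_atoms
        if any(math.dist(xyz, p) <= cutoff for p in peptide_atoms)
    }

def contact_histogram(structures, cutoff=4.0):
    """Count, over all structures, how often each protein residue is in contact."""
    counts = Counter()
    for protein_atoms, peptide_atoms in structures:
        counts.update(contact_residues(protein_atoms, peptide_atoms, cutoff))
    return counts

# Toy data: two protein residues and three docked poses of a small peptide.
toy_protein = [(305, (0.0, 0.0, 0.0)), (305, (1.0, 0.0, 0.0)), (420, (20.0, 0.0, 0.0))]
toy_structures = [
    (toy_protein, [(3.0, 0.0, 0.0)]),                    # touches residue 305
    (toy_protein, [(17.0, 0.0, 0.0)]),                   # touches residue 420
    (toy_protein, [(2.0, 0.0, 0.0), (18.0, 0.0, 0.0)]),  # touches both
]
toy_hist = contact_histogram(toy_structures)
```

Swapping the roles of the two partners gives the corresponding histogram over peptide residues.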

Upon visualization of the top 5 structures, we found that the prospective areas of contact on the PA can be further
isolated. For example, the peak near residue 550 in Figure 3.2.1-RIGHT also corresponds to peptide contact near
residues 300-320, which are in very close proximity on the surface of the PA protein. This leads us to the identification
of three spots on the PA that have a high affinity for binding with the DS23 peptide: (1) residues 300-320, the back loop of
domain 2; (2) residues 415-425, in domain 2; and (3) residues 680-695, at the bottom of domain 4. In 3 of
the 5 top docked structures, the DS23 peptide was located near (1), the back loop of domain 2. Additionally, the DS23
in the top scoring docked structure was located near (1).


Figure  3.2.1  LEFT  Contact  histogram  of  top  25  docked  structures  based  on  interface  energy.  RIGHT  Contact  histogram  of  top  5  docked  structures 
based  on  interface  energy.  Residues  on  the  PA[A]  which  DS23[X]  is  in  contact  with  are  indicated  on  the  abscissa,  frequency  on  the  ordinate. 


3.3  Key  Residues  on  DS23  Peptide 

We then performed a similar contact analysis of the DS23 moiety for the top 5 docked structures as ranked by interface
energy. Shown in Figure 3.3.1 is a contact histogram identifying residues on DS23 that make frequent contact with the
PA in our docked structures. Analysis of the residues with highest contact frequency showed that they are largely
mid-chain, hydrophobic, and planar.

Further study of the single top docked structure showed a similar trend. In Figure 3.3.2, the peptide contact histogram of
the single top docked structure after an additional short molecular dynamics simulation is shown. Residues 6 and 7 show
persistent contact with the PA, as do residues 1, 10, and 12.




Figure  3.3.1  Contact  histogram  of  top  5  docked  structures  based  on  interface  energy. 
Residues  on  DS23  [X]  which  PA  [A]  is  in  contact  with  are  indicated  on  the  abscissa, 
frequency  on  the  ordinate. 




Figure 3.3.2 Contact histogram of top docked structure based on interface energy, after
a 3 ns molecular dynamics run. Residues on DS23 [X] which PA [A] is in contact with
are indicated on the abscissa, frequency on the ordinate.

3.4  Metrics  of  Flexibility-Enhanced  Docking  Scheme


The improvement obtained with the flexibility-enhanced scheme is shown by comparing the average score per structure,
the total number of hydrogen bonds in all analyzed structures, and the amount of structural movement for the
improved docking protocol against the basic Rosetta docking method. Table 3.4.1 compares these two stages of our
docking scheme. Inclusion of molecular dynamics and minimization within the
protocol improves the interface score by almost 5 Rosetta Units and greatly increases the total number of hydrogen
bonds. Considering RMSD/structure (Root Mean Square Displacement per atom, per structure), we do not see a large
amount of structural change when using molecular dynamics after Rosetta docking. We note that this RMSD does not
take into account any large scale fluctuations from the initial dynamics trajectory sampling.
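A per-structure displacement of this kind can be computed directly from matched coordinate sets. The sketch below assumes that the two sets list the same atoms in the same order and that no superposition is applied before the comparison, which is one reading of "displacement per atom"; the function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square displacement between two matched coordinate sets,
    computed in place (no fitting or superposition)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    sq = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

def mean_rmsd(pairs):
    """Average RMSD over (before, after) coordinate pairs, one per structure."""
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)

# Toy check: every atom shifted by exactly 1 unit along z.
before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
after = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
shift_rmsd = rmsd(before, after)
```

Averaging `rmsd` over the analyzed structures, with the docked coordinates as `before` and the refined coordinates as `after`, gives a single number comparable to the RMSD/structure entry.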

Table 3.4.1. Toolkit (T) vs. Standard Docking (D) metrics for top 50 structures.

                   Average score/structure     Total Hydrogen       RMSD/structure from
                   (Rosetta Units)             Bonds                (D) to (T) (Å)

Interface Score    -8.07 (T) vs. -3.30 (D)     53 (T) vs. 2 (D)     0.69


4.  CONCLUSIONS 


The XPairIt protocol was developed to improve docking of flexible biological partners such as peptides and proteins.
The premier case study for the software is designing a peptide binder for the Anthrax protective antigen. Preliminary
docking results are complete and we continue to collaborate with ongoing experimental work through an iterative
approach, allowing for development and refinement of our software toolkit.

Structure and function of several patches on the PA offer an explanation of these preliminary docking results. Our top
scoring and most frequent docked structure was that of the DS23 in contact with the PA's domain 2 loop. This highly
flexible chain on the PA plays a large role in membrane insertion, and contains several hydrophobic, aromatic residues,
specifically two PHEs that are exposed to the solvent. [12] This loop presents a large surface area for the binding of
several hydrophobic residues of the DS23 peptide, and docked structures here exhibit the best interface energy, which
acts as a barrier to desorption of the peptide. This matching of hydrophobic patches is illustrated in Figure 4.1.1 below.
Additional preferred locations for DS23 binding appeared in an alternative section of PA's domain 2 and at the base of
domain 4. It is noteworthy that these locations also display an appreciable hydrophobic surface. Interestingly, the
putative domain 4 binding location, identified entirely a priori by this simulation, is in fact an active binding site for
the M18 antibody and CMG2 cell receptor protein. [19, 20] It is encouraging to see the Rosetta scoring function and
flexibility introduced with NAMD drive the DS23 peptide toward these active sites, and highly auspicious that binding
features introduced within the dynamics portion of the protocol (such as additional hydrogen bonding) do not disappear
under docking repacking and reminimization. However, as all putative binding sites are patently hydrophobic, we
question whether the subtleties leading to prediction of specific binding are clouded by the energy expressions used.
Work probing this question is ongoing.




Figure 4.1.1 Top docked structure binding to the domain 2 loop. The DS23
peptide is clearly visible in the lower lefthand side. Coloring is by Eisenberg
hydrophobicity scale from red (hydrophobic) to blue (hydrophilic).

The basic metrics provided demonstrate a marked improvement in our ability to generate a priori predictive docked
structures. Larger magnitude (more negative) interface energies and a greater number of hydrogen bonds indicate the
effectiveness of the protocol. The importance and frequency of hydrogen bonding in bound configurations of peptides is
illustrated in a recent survey of peptide-protein complexes. [21] Displacement values show, however, that this is not a
magic bullet. Full resolution molecular dynamics produced an RMSD of less than 1.0 Å. Here, we acknowledge the aid
in PA structural relaxation leading to improved peptide contact, but consider that large-scale structural moves may be
better represented by other simulation methodology.

Our work on the DS23-PA system continues, with experimental validation of proposed peptide docking locations and
testing of peptide mutations for binding improvement. From the software perspective, the XPairIt toolkit is currently
being extended to support a coarse-grained atomistic potential, to incorporate a more rigorous calculation of system
electrostatics, and to include capability for quantum mechanics.